


(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 0 698 846 A1

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 28.02.1996 Bulletin 1996/09

(21) Application number: 95305193.5

(22) Date of filing: 25.07.1995

(51) Int. Cl.6: G06F 9/38

(84) Designated Contracting States: DE FR GB NL

(30) Priority: 24.08.1994 US 295126

(71) Applicant: SUN MICROSYSTEMS, INC.
     Mountain View, CA 94043 (US)

(72) Inventor: Yung, Robert
     Fremont, California 94555 (US)

(74) Representative: Johnson, Terence Leslie
     London WC2A 1SD (GB)

(54) Instruction result labeling in a counterflow pipeline processor 



(57) The present invention provides an efficient 
streamlined pipeline for a counterflow pipeline processor 
(110) with a renaming table (235). The counterflow pipeline
(110) includes an execution pipe (220) having multiple
instruction stages forming an instruction pipe (221),
a plurality of result stages forming a result pipe, and a 
corresponding plurality of comparator/inserters. Each 
comparator/inserter couples an instruction stage to a 
corresponding result stage. The counterflow pipeline 
also includes a register exam stage with the renaming 
table (235). The renaming table has entries for associ- 
ating each register value of an instruction with a unique 
renamed register number (RRN), thereby eliminating the 
need for arbitration and housekeeping (killing of stale 
register values), as instructions and their respective reg- 
ister values counterflow in the streamlined counterflow 
pipeline. An RRN counter, such as a modulo counter, is 
coupled to the renaming table and provides unique 
RRNs for assignment to new register values. In accord- 
ance with one embodiment of the invention, instructions 
are decoded and unique RRNs assigned to the source 
and destination operand registers. If there is no previous 
RRN assigned to a register operand, its register value is
retrieved from a register file and inserted into the top of 
the result pipe. In addition, when an instruction execution 
produces a register result value in the execution pipe, 
the associated RRN and register value are inserted lat- 
erally into the result pipe. The register values and RRNs, 
in the form of result packages, are garnered by younger 
(later in program order) instructions counterflowing up 
the instruction pipe. 

Description

RELATED APPLICATIONS 

U.S. Patent Application Serial No. 08/140,654, entitled
"Counterflow Pipeline", filed 10/21/93, U.S. Patent
Application Serial No. 08/140,655, entitled "Counterflow
Pipeline Processor", filed 10/21/93, and U.S. Patent
Application Serial No. 08/208,526, entitled "Scoreboard
Table for a Counterflow Pipeline Processor", filed 3/8/94,
incorporated by reference herein, are assigned to Sun
Microsystems, Inc., Mountain View, CA, assignee of the
present application. The above identified patent
applications are incorporated for the purpose of setting the
stage for a discussion of the present invention and hence are
not to be considered prior art nor an admission of prior art.

FIELD OF THE INVENTION 

The present invention relates to computer systems, 
and more particularly to a microprocessor having a
counterflow pipeline and a renaming table. 

BACKGROUND OF THE INVENTION 

Microprocessors run user defined computer pro- 
grams to accomplish specific tasks. All programs, re- 
gardless of the task, are composed of a series of instruc- 
tions. State-of-the-art microprocessors execute instruc- 
tions using a multi-stage pipeline. Instructions enter at 
one end of the pipeline, are processed through the stag- 
es, and the results of the instructions exit at the opposite 
end of the pipeline. Typically, a pipelined processor in- 
cludes an instruction fetch stage, an instruction decode 
and register fetch stage, an execution stage, a memory
access stage and a write-back stage. The pipeline
increases the number of instructions being executed
simultaneously, and thus the overall processor throughput
is improved. A superscalar processor is a processor that 
includes several pipelines arranged to execute several 
instructions in a parallel fashion. 

Control and data hazards are a problem with superscalar
pipelined processors. They occur when instructions are
dependent upon one another. Consider a first pipeline
executing a first instruction that specifies a destination
register (X). A second instruction, to be executed by a
second pipeline, is said to be dependent if it needs the
contents of register (X). If the second pipeline were to use
the contents of register (X) prior to the completion of the
first instruction, an incorrect outcome may be obtained
because the data in register (X) stored in a register file
may be out-of-date, i.e., stale. Several approaches to
avoiding the data hazard problem are described in the above
identified patent applications, which describe, respectively,
a pipeline design for a counterflow pipelined processor, a
microprocessor architecture based on the same, and a
scoreboard for the same.



The counterflow pipeline processor (CFPP) of the above
identified applications departs from traditional pipeline
designs in that information flow is bi-directional.
Instructions are stored in an instruction cache. These
instructions enter the counterflow pipeline in program order
at a launch stage and proceed to a decoder for a
determination of the instruction class, e.g., branch, load,
add and multiply. Next, the instructions proceed to a
register exam stage where the source and destination operand
registers, if any, are identified and a retrieval of
necessary source operand value(s) from a register file is
initiated. These source operand value(s) are retrieved in
one of several ways from the register file and inserted
into the top of the result pipe. Alternatively, the operand
values can be transferred directly into the instructions.

Next, the instructions in the form of instruction packages
enter and advance up the instruction pipe to be executed.
Subsequently, register values are generated for the
destination register operands of the instruction packages.
These register values are inserted laterally into the
respective result stages of the result pipe in the form of
result packages which counterflow down the result pipe.
As a younger (later in program order) instruction package
meets a result package that is needed by that instruction
package, that register value is copied. This copying
process, which is referred to as "garnering", reduces
the stall problem common with scalar pipeline processors
of the prior art. Hence, instruction packages flow up
an instruction pipe of the counterflow pipeline while the
register values from previous instruction packages flow
down the result pipe of the same counterflow pipeline.

Variations of the counterflow pipeline are possible.
For example, the instruction pipe and the result pipe,
which together form an execution pipe, can be implemented
to interoperate asynchronously. One drawback of such an
asynchronous design is the requirement of complex
arbitration and comparing logic coupled between each
instruction stage and the corresponding result stage to
guarantee that register value(s) do not overtake any
younger instruction packages requiring those result
packages. The advance of instruction packages up the
instruction pipe and the counterflow of the result packages
down the result pipe must be properly arbitrated by the
complex arbitration and comparing logic for two important
reasons.

First, at every stage of the execution pipe, the
arbitration and comparing logic ensures that a targeted
result package does not overtake any younger instruction
package requiring the corresponding targeted register
value for one of its source operands. This is accomplished
by ensuring that each required source register operand of
an instruction package in an instruction stage is checked
against any result package in a preceding result stage,
before the instruction package and the compared result
package are allowed to pass each other in the execution
pipe. Arbitration at every stage of the execution pipe is
time consuming and disrupts the concurrency between
instruction package flow and result package flow in the
execution pipe.

Second, there is a need to prevent younger instruc- 
tions from garnering stale (expired) result packages. 
Stale result packages are those result packages with 
register values that have been superseded by new
register values produced by younger instruction packages.
Hence, upon a subsequent write to a destination oper- 
and register, younger instruction packages have the task 
of "killing", i.e., invalidating any stale result packages, as 
the younger instruction packages advance up the in- 
struction pipe. 

The above described arbitration of the counterflow 
pipeline ensures that instruction packages and their
respective result packages counterflow in an orderly
manner. However, a typical execution pipe may be ten or
more stages deep and the time penalty for arbitration can 
be substantial. Hence, there is a need for a more efficient 
counterflow pipeline architecture where the instruction 
and result packages can flow more concurrently, by elim- 
inating the need for arbitration for "killing" of stale register 
values. 

SUMMARY OF THE INVENTION 

The present invention provides an efficient stream- 
lined pipeline for a counterflow pipeline processor with a 
renaming table. The counterflow pipeline includes an ex- 
ecution pipe having multiple instruction stages forming 
an instruction pipe, a plurality of result stages forming a 
result pipe, and a corresponding plurality of
comparator/inserters. Each comparator/inserter couples an
instruction stage to a corresponding result stage. The
counterflow pipeline also includes a register examination 
(exam) stage with the renaming table. The renaming ta- 
ble has assignment entries for associating register val- 
ues of instructions with unique register identifiers, e.g., 
renamed register numbers (RRNs). As a result, the reg- 
ister values are distinguishable from each other, thereby 
minimizing the need for complex arbitration and house- 
keeping (killing of stale register values), as younger (later 
in program order) instructions and their targeted register 
values counterflow in the streamlined counterflow pipe- 
line. A counter, such as a modulo counter, is coupled to 
the renaming table and provides unique register identi- 
fiers for new assignments. 

In one embodiment, an instruction advances up the 
counterflow pipeline until it reaches the register exam 
stage. For each source operand register that is not al- 
ready represented by an entry in the renaming table, the 
RRN counter assigns a unique RRN to the operand reg- 
ister. Conversely, if the instruction includes a destination 
operand register, a new RRN is assigned to the operand 
register. The RRN assignments are recorded as entries 
in the renaming table. These RRNs are also recorded in 
the respective source register RRN field(s) and/or des- 
tination register RRN field of a corresponding instruction 
package. 

Next, the instruction, in the form of the instruction
package, enters the execution pipe and is processed by
an instruction stage capable of executing the required
operation. When the source operand register value(s)
have been garnered from the result pipe by matching the
respective RRN(s), the instruction package is executed.
If the instruction package includes a destination operand
register, the destination register value and associated
RRN are inserted laterally into a result stage of the result
pipe in the form of a result package. Subsequently,
younger (later in program order) instruction packages
are able to retrieve, i.e., garner, these targeted register
value(s) solely by matching the respective unique RRNs
of the targeted result packages counterflowing down the
result pipe.

Note that prior to assigning a new RRN to a destination
operand register, the renaming table is scanned and any
existing valid entry(s) matching the same operand register
is invalidated or overwritten. By maintaining an up-to-date
renaming table, the need for very tight arbitration between
the instruction pipe and the result pipe, required to kill
stale results in the result pipe, is eliminated. Instead,
instruction and result packages can now advance in the
instruction and result pipes in a less inhibited manner.
As a result, the throughput of the counterflow pipeline is
substantially improved.

The present invention also eliminates the need for
flushing the result pipe whenever a "trap" or similar event
occurs in the counterflow pipeline. Result pipe flushing
is unnecessary because register values of operand
registers are associated with unique RRNs instead of
non-unique register identifiers, and hence any stale
register values remaining in the result pipe are simply
ignored by younger instruction packages propagating in
the execution pipe.


DESCRIPTION OF THE DRAWINGS 

The objects, features and advantages of the system
of the present invention will be apparent from the
following description in which:

Figure 1 is a block diagram of a computer system
of the present invention.

Figure 2A is a block diagram illustrating one
embodiment of a counterflow pipeline for a Counterflow
Pipeline Processor (CFPP) in accordance with the invention.

Figure 2B is a block diagram of an execution pipe
of the counterflow pipeline.

Figure 3 shows one implementation of an instruction
package for the CFPP.

Figure 4 shows one implementation of a result
package for the CFPP.

Figure 5 illustrates the format of a renaming table
for the counterflow pipeline.

Figures 6A through 6F illustrate the contents of
the renaming table corresponding to the processing of
assembler instructions I1 to I3.

Figure 7A is a flowchart illustrating the processing
of instructions in the counterflow pipeline in accordance
with the invention.

Figure 7B is a flowchart illustrating assignments
and subsequent use of renamed register numbers
(RRNs) for source operand registers of the instructions.

Figure 7C is a flowchart illustrating assignments
and subsequent use of RRNs for destination operand
registers of the instructions.

Figure 7D is a flowchart illustrating the processing
of instruction packages in the execution pipe of the
counterflow pipeline.

DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

Referring to Figure 1, a block diagram of a computer
system of the present invention is shown. The computer
system 100 includes an input device 104, a display
device 106, an input/output (I/O) circuit 107, a system
memory 108, and a central processing unit (CPU) 110.
Since the basic operation of computer system 100 is well
known, a detailed description is not provided herein. In one
embodiment, CPU 110 is based on a scalable processor
architecture (SPARC). For more information on SPARC,
including the SPARC instruction set, see for example
"The SPARC Architecture Manual Version 9", SPARC
International, 1993, incorporated by reference herein. It
should be noted that the present invention may be based
on any microprocessor architecture and may use any
type of instruction set, including complex (CISC) or
reduced (RISC) instruction sets.

In accordance with one embodiment of the present
invention as shown in Figure 2A, CPU 110 is a counterflow
pipeline processor (CFPP) and is provided with an enhanced
pipeline 200 with a renaming table 235. Enhanced pipeline
200 includes an execution pipe 220, a register exam stage
230, a decoder 240, an instruction launch stage 250 and an
instruction cache 260. Register exam stage 230 includes
renaming table 235 and a counter 236. Execution pipe 220 is
coupled to a register file 210. Register exam stage 230 is
also coupled to register file 210.

Figure 2B is a detailed block diagram of execution
pipe 220 which includes an instruction pipe 221 with a
plurality of instruction stages IS(1), IS(2), IS(3), ... IS(N),
a result pipe 222 with a plurality of result stages RS(1),
RS(2), RS(3), ... RS(N), and a comparator/inserter 223
with a plurality of comparators/inserters CI(1), CI(2),
CI(3), ... CI(N). Each instruction stage is coupled to a result
stage by a corresponding comparator/inserter. For example,
instruction stage IS(n) is coupled to result stage
RS(n) by comparator/inserter CI(n). In addition, register
exam stage 230 includes a renamed register number
(RRN) counter 236 which provides unique RRNs for
renaming table 235. In this embodiment, RRN counter 236
is a modulo counter and register file 210 has N registers
R1, R2, ... RN.

Referring to Figure 3, one implementation of an
instruction package used in CPU 110 is shown. Instruction
package 300 includes an Operation (OP) Code field 331,
a non-unique first Source Register (SR) identifier field
332, a first source value field 333, a first flag 334, a
non-unique second SR identifier field 335, a second
source value field 336, a second flag 337, a non-unique
Destination Register (DR) identifier field 338, a result
value field 339, and a third flag 340. In addition, instruction
package 300 includes three unique register identifier
fields 332a, 335a and 338a.

Instruction OP code field 331 holds the instruction
of instruction package 300. Fields 332 and 335 each hold
a non-unique register identifier for the source operand
registers, such as a physical or virtual register number.
In the SPARC microprocessor environment, the physical
register number (PRN) is derived by combining the
virtual register number (VRN) with a register window
pointer. Fields 333 and 336 hold the values for these source
operand registers respectively. Flags 334 and 337
indicate if a particular instruction package 300 uses the SRs
as identified in fields 332 and 335 and their validity,
respectively. Field 338 holds the non-unique identifier for
the DR, field 339 holds the value for that destination
register, and flag 340 indicates if the instruction package 300
specifies a DR.

In accordance with this embodiment of the invention,
the unique register identifier fields are RRN fields,
SR1RRN 332a, SR2RRN 335a and DRRRN 338a, the
RRNs of first source register SR1, second source register
SR2 and destination register DR, respectively. As
described in greater detail below, RRN fields 332a, 335a
and 338a provide unique register identifiers for
distinguishing register values from each other.

Referring to Figure 4, a corresponding implementation
of a result package used in CPU 110 is shown.
Result package 400 includes a destination register
identifier field 452, a value field 453 and a flag 454. Field 452
holds a unique register identifier for the DR. In this
implementation, field 452 is a destination register RRN.
The value field 453 holds the contents of the particular
DR identified in field 452. Note that if an instruction
package does not identify a DR, a result package 450 is not
created for that instruction.
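
The two package layouts of Figures 3 and 4 can be pictured as plain records. The following is only an illustrative sketch: the class and attribute names are hypothetical, the meaning of flag 454 is assumed to be a validity bit (the text does not say), and the register value 12 is an assumed example value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstructionPackage:
    """Hypothetical model of instruction package 300 (Figure 3)."""
    op_code: str                         # field 331
    sr1: Optional[str] = None            # field 332: non-unique SR1 identifier
    sr1_value: Optional[int] = None      # field 333
    sr1_flag: bool = False               # flag 334
    sr2: Optional[str] = None            # field 335: non-unique SR2 identifier
    sr2_value: Optional[int] = None      # field 336
    sr2_flag: bool = False               # flag 337
    dr: Optional[str] = None             # field 338: non-unique DR identifier
    result_value: Optional[int] = None   # field 339
    dr_flag: bool = False                # flag 340: package specifies a DR
    sr1_rrn: Optional[int] = None        # field 332a (SR1RRN)
    sr2_rrn: Optional[int] = None        # field 335a (SR2RRN)
    dr_rrn: Optional[int] = None         # field 338a (DRRRN)


@dataclass
class ResultPackage:
    """Hypothetical model of the result package of Figure 4."""
    dr_rrn: int          # field 452: unique identifier (RRN) of the DR
    value: int           # field 453: contents of that DR
    flag: bool = True    # flag 454 (assumed here to mean "valid")


# An ADD of the immediate 2 and register R1 into R2, with the RRNs
# "001" for R1 and "010" for R2 filled in.
i1 = InstructionPackage(op_code="ADD", sr1_value=2, sr1_flag=True,
                        sr2="R1", sr2_flag=True, sr2_rrn=0b001,
                        dr="R2", dr_flag=True, dr_rrn=0b010)

# The result package carries only the RRN, not the register name.
r2_result = ResultPackage(dr_rrn=i1.dr_rrn, value=12)  # 12 is an assumed value
```

Keeping the register name out of the result package mirrors the text: younger instructions match on field 452 alone.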

Figure 5 shows one implementation of renaming table
235 in greater detail. Renaming table 235 has 2*N
entries and each entry includes three elements, a physical
register number, e.g., R1, a renamed register
number (RRN) and a validity flag. As discussed above,
execution pipe 220 has N stages. Hence, there can be
at most N destination operand registers in instruction
pipe 221 and N register result values of result packages
in result pipe 222, and so the maximum number of table
entries is 2*N, i.e., the maximum range of values needed
for the RRNs is modulo 2*N. Operation of renaming table
235 is best described using an exemplary sequence of
assembler instructions, such as the sequence of
instructions I1 through I3 listed below:

    I1:  ADD 2, R1, R2     ;; R2 := 2 + R1
    I2:  SUB R2, 5, R2     ;; R2 := R2 - 5
    I3:  MUL R2, R3, R4    ;; R4 := R2 * R3

Initially, enhanced counterflow pipeline 200 is empty
and the register values of registers R1, R2, R3, R4 are
stored in register file 210. All the entries of renaming
table 235 are invalid as shown in Figure 6A. RRN counter
236 is reset to a suitable starting value, such as "001".
Figures 6B through 6F illustrate the subsequent contents
of renaming table 235 as instructions I1 to I3 are
dispatched by instruction launch stage 250 and eventually
executed in instruction pipe 221.

Specifically, Figure 6A shows the state of renaming
table 235 prior to the examination of instruction I1 in
register exam stage 230. Figure 6B shows the state of
renaming table 235 after the examination of the source
operand registers of instruction I1. Figure 6C shows the
state of renaming table 235 after instruction I1 has been
examined but before the examination of instruction I2.
Figure 6D illustrates table 235 after the examination of
instruction I2. Figure 6E shows the state of renaming
table 235 after the examination of the source operands of
instruction I3. Finally, Figure 6F corresponds to the state
of table 235 after all three instructions I1, I2, I3 have been
examined.
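
The sequence of table states can be reproduced with a small simulation. This is a sketch under stated assumptions, not the circuit of the embodiment: the class and method names are hypothetical, the table is simplified to a dictionary holding only the valid entries, and N = 4 stages is an assumed size so that RRNs fit in three bits. The RRN shown for R4 is simply what incrementing the counter yields in this model.

```python
class RenamingTable:
    """Simplified model: valid entries only, keyed by register name."""

    def __init__(self, n_stages, start=1):
        self.modulus = 2 * n_stages   # RRNs range over modulo 2*N
        self.counter = start          # RRN counter reset to "001"
        self.entries = {}             # register name -> RRN

    def _next_rrn(self):
        rrn = self.counter
        self.counter = (self.counter + 1) % self.modulus
        return rrn

    def source(self, reg):
        # Reuse an existing valid entry, else assign a new RRN.
        if reg not in self.entries:
            self.entries[reg] = self._next_rrn()
        return self.entries[reg]

    def destination(self, reg):
        # Always assign a new RRN, writing over any existing entry.
        self.entries[reg] = self._next_rrn()
        return self.entries[reg]

    def snapshot(self):
        return {reg: format(rrn, "03b") for reg, rrn in self.entries.items()}


table = RenamingTable(n_stages=4)          # N = 4 is an assumption
snapshots = {"6A": table.snapshot()}       # before I1: all entries invalid
table.source("R1")                         # I1: ADD 2, R1, R2 (sources)
snapshots["6B"] = table.snapshot()
table.destination("R2")                    # I1 destination
snapshots["6C"] = table.snapshot()
table.source("R2")                         # I2: SUB R2, 5, R2
table.destination("R2")
snapshots["6D"] = table.snapshot()
table.source("R2")                         # I3: MUL R2, R3, R4 (sources)
table.source("R3")
snapshots["6E"] = table.snapshot()
table.destination("R4")                    # I3 destination
snapshots["6F"] = table.snapshot()

for figure, state in snapshots.items():
    print(figure, state)
```

In this model R2 is associated with "010" after I1 and with "011" after I2, so a value produced by I1 can no longer be matched by I3.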

Figures 7A-7D are flowcharts illustrating the processing
of instructions in counterflow pipeline 200 in accordance
with this embodiment of the present invention. Figure 7A
is a high level flowchart illustrating the processing
of instructions in counterflow pipeline 200. Figure 7B is
a detailed flowchart illustrating the assignments and use
of RRNs for source operand registers of the instructions.
Figure 7C is a detailed flowchart illustrating the
assignments and use of RRNs for destination operand registers
of the instructions. Figure 7D is a detailed flowchart
illustrating the processing of instruction packages in
execution pipe 220. Note that although the processing of
source and destination registers of the instructions is
described in sequential steps, concurrent handling of
multiple operand registers is possible by trading circuit
complexity for higher performance.

Conceptually, as illustrated by Figure 7B, for each 
source operand register detected, renaming table 235 is 
scanned for an existing entry matching the source reg- 
ister (steps 731 and 732). If such an entry exists, the 
RRN associated with the source register is retrieved 
(step 733). Conversely, if no valid entry is found, a new 
RRN is assigned, the assignment recorded in renaming 
table 235, and a request for the register value of the 
source register is communicated to register file 210 
(steps 734, 735). In both cases, the RRN is written into
the corresponding source register RRN field of the in- 
struction package (step 736). 
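
The source operand steps of Figure 7B can be sketched as a single function. The names below are hypothetical, and the choice of the first invalid slot for a new entry is an assumption, since the text does not say which entry is used.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One renaming table entry: register number, RRN, validity flag."""
    register: str = ""
    rrn: int = 0
    valid: bool = False


def rename_source(table, next_rrn, register):
    """Return (rrn, updated next_rrn, fetch_needed) for a source register."""
    for entry in table:                               # steps 731, 732: scan
        if entry.valid and entry.register == register:
            return entry.rrn, next_rrn, False         # step 733: reuse RRN
    # Assumed policy: record the new assignment in the first invalid slot.
    free = next(e for e in table if not e.valid)
    free.register, free.rrn, free.valid = register, next_rrn, True  # step 734
    updated = (next_rrn + 1) % len(table)             # table holds 2*N entries
    # fetch_needed=True stands for the request to the register file (step 735).
    return free.rrn, updated, True


table = [Entry() for _ in range(8)]                   # 2*N entries, N = 4 assumed
first = rename_source(table, 1, "R1")                 # no entry yet: new RRN
second = rename_source(table, first[1], "R1")         # entry exists: reused
```

In both cases the caller would then write the returned RRN into the source register RRN field of the instruction package (step 736).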

Referring now to Figure 7C, in the case of a destination
operand register, the renaming table is scanned
for an existing entry matching the destination operand
register (step 741). If such a valid entry is found, the
existing entry is invalidated or written over (step 743). Next,
a new RRN is assigned to the destination operand register
and the assignment recorded in renaming table 235
(step 744). The RRN is also recorded into the corresponding
destination register RRN field of the instruction
package (step 745). Note that until an identical destination
operand register is rewritten by another instruction,
the same RRN will be used for younger instructions having
this destination operand register as one of their source
operand registers.
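
The destination operand steps of Figure 7C differ from the source case in that a new RRN is always assigned. A minimal sketch, with hypothetical names and the "written over" variant of step 743 (rather than invalidation) chosen for illustration:

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One renaming table entry: register number, RRN, validity flag."""
    register: str = ""
    rrn: int = 0
    valid: bool = False


def rename_destination(table, next_rrn, register):
    """Return (rrn, updated next_rrn) for a destination register."""
    slot = None
    for entry in table:                               # step 741: scan
        if entry.valid and entry.register == register:
            slot = entry                              # step 743: write over it
            break
    if slot is None:
        # Assumed policy: otherwise use the first invalid slot.
        slot = next(e for e in table if not e.valid)
    slot.register, slot.rrn, slot.valid = register, next_rrn, True  # step 744
    return slot.rrn, (next_rrn + 1) % len(table)      # table holds 2*N entries


table = [Entry() for _ in range(8)]                   # 2*N entries, N = 4 assumed
table[1] = Entry("R2", 0b010, True)                   # R2 already renamed to "010"
rrn, next_rrn = rename_destination(table, 0b011, "R2")
```

After the call there is still exactly one valid entry for R2, now carrying the new RRN, so a later source lookup of R2 cannot return the old number.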

The following is a detailed description of the operation
of counterflow pipeline 200 using assembler instructions
I1, I2 and I3 as specific examples. Initially, instruction
I1 is fetched from instruction cache 260, and
launched by instruction launch stage 250 into register
exam stage 230 where it is decoded and examined for
(non-unique) source and destination operand registers
(steps 710 and 720 of Figure 7A). CPU 110 scans
renaming table 235 for a valid entry associated with source
operand register R1 (step 731 of Figure 7B). Since all
the entries of renaming table 235 are invalid as shown
in Figure 6A, operand register R1 is assigned a new RRN
"001" by RRN counter 236 (step 734). As shown in the
table of Figure 6B, the assignment of operand register
R1 with RRN "001" is recorded in a first entry
of renaming table 235 which is marked "valid".
Subsequently, until operand register R1 is rewritten, younger
instruction packages with register R1 as a source operand
register will associate register R1 with RRN "001".

Two steps occur next. First, a request for the register
value of register R1 is communicated to register file 210
(step 735). Second, the source operand RRN of R1 is
also written into SR2RRN field 335a of a corresponding
instruction I1 package (step 736). Many variations in
transferring source register values from register file 210
into the instruction packages are possible. In this
embodiment, the source register values are retrieved from
register file 210 and inserted with their respective RRNs into
result pipe 222 at result stage RS(N) in the form of result
packages. In another embodiment, result pipe 222 is
bypassed and source register values are transferred from
register file 210 directly into the instruction packages.

As shown in Figure 7C, a scan of renaming table
235 is conducted for destination operand register R2
(step 741). Since there is no valid entry matching
destination register R2, register R2 is assigned a new RRN
of "010", obtained by incrementing RRN counter 236.
Alternatively, RRN counter 236 may be pre-incremented
after each new RRN assignment. The assignment is
recorded in a second entry of renaming table 235 as shown
in Figure 6C (step 744). In addition, RRN "010" is recorded
into DRRRN field 338a of instruction I1 package (step
745).

Note that prior to an assignment of a new RRN to a
destination operand register, CPU 110 scans renaming
table 235 for an existing "valid" entry matching the
destination operand register (step 742). If a valid entry exists,
the entry is marked "invalid" or written over with the new
RRN assignment for the operand register (step 743).
Invalidating or writing over such existing valid entries of
renaming table 235 associated with the destination
operand register ensures that stale register values associated
with the destination operand register are not erroneously
retrieved from result pipe 222 by younger
instructions. Updating renaming table 235 is necessary
since different result packages corresponding to an
identical operand register may be inserted into result pipe
222 out of order at different result stages. Such anomalies
occur because although instruction packages enter
execution pipe 220 in an original program order, these
instruction packages are processed concurrently by different
instruction stages of execution pipe 220. As a result,
the execution of these instruction packages can complete
in an order different from that of the original program
order.
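
The effect of unique RRNs on stale values can be illustrated with a toy result pipe. This is only a sketch: the function names are hypothetical, the register name is carried in each tuple purely to contrast the two matching rules (the result package of the embodiment carries only the RRN), and the values 12 and 7 assume R1 holds 10.

```python
def garner_by_name(result_pipe, register):
    """Match on the non-unique register name: first package met wins."""
    for name, _rrn, value in result_pipe:
        if name == register:
            return value
    return None


def garner_by_rrn(result_pipe, wanted_rrn):
    """Match on the unique RRN: only the targeted package can match."""
    for _name, rrn, value in result_pipe:
        if rrn == wanted_rrn:
            return value
    return None


# Two results for R2 are in flight: a stale one (RRN "010") and the
# current one (RRN "011"). A younger instruction wants RRN "011".
stale_first = [("R2", 0b010, 12), ("R2", 0b011, 7)]
current_first = list(reversed(stale_first))

by_name = garner_by_name(stale_first, "R2")       # picks up the stale value
by_rrn = garner_by_rrn(stale_first, 0b011)        # ignores the stale value
by_rrn_reordered = garner_by_rrn(current_first, 0b011)
```

Matching by RRN gives the same answer whichever order the packages are met in, which is why no stage-by-stage killing of stale packages is needed in this model.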

Instruction I1 package corresponding to instruction
I1 enters instruction stage IS(1) of instruction pipe 221
and advances up instruction pipe 221 before reaching
an available instruction stage capable of performing an
"ADD" operation, e.g., instruction stage IS(f) (steps 750
and 760 of Figure 7A). Meanwhile, the register value and
its associated RRN "001" are propagating down result
pipe 222 towards result stage RS(f) which corresponds
to instruction stage IS(f). This register value associated
with RRN "001" is garnered via comparator/inserter
CI(f) and stored in second source value field 336 (step 761
of Figure 7D). Instruction I1 package is executed, i.e.,
the integer value 2, which is stored in first source value
field 333, is added to the garnered register value, which
is stored in second source value field 336 (step 762). The
resulting sum, i.e., the register value for destination
register R2, is recorded in result value field 339 of the
instruction I1 package (step 763). This register value is
also inserted laterally by comparator/inserter CI(f) into
result stage RS(f) together with RRN "010" (step 764).
Subsequently, the register value of register R2 can be
identified and garnered by younger instruction packages
which have register R2 as one of their source operands as
the younger instruction packages advance up instruction
pipe 221.
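
Steps 761 through 764 for an ADD package can be sketched as follows. The function name, the dictionary keys and the list standing in for result stage RS(f) are all hypothetical, and the value 10 for register R1 is an assumed example.

```python
def execute_add(package, result_stage):
    """Garner, execute and insert for an ADD package at one stage.

    Returns True if the package executed, False if its source operand
    has not yet arrived in this result stage.
    """
    # Step 761: garner the source value by matching the unique RRN.
    for rrn, value in result_stage:
        if rrn == package["sr2_rrn"]:
            package["sr2_value"] = value
    if package["sr2_value"] is None:
        return False
    # Steps 762, 763: execute and record the result in the package.
    package["result_value"] = package["sr1_value"] + package["sr2_value"]
    # Step 764: insert the result laterally together with the DR's RRN.
    result_stage.append((package["dr_rrn"], package["result_value"]))
    return True


# An ADD of immediate 2 and the register renamed "001", writing "010".
add_package = {"sr1_value": 2, "sr2_rrn": 0b001, "sr2_value": None,
               "dr_rrn": 0b010, "result_value": None}

empty_stage = []
waited = execute_add(dict(add_package), empty_stage)  # operand not there yet

stage_f = [(0b001, 10)]              # result package for R1, assumed value 10
done = execute_add(add_package, stage_f)
```

After the second call the stage holds both the original package for RRN "001" and the new package for RRN "010", ready to be garnered by a younger instruction.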

When instruction I2 reaches register exam stage
230, register R2 is identified as one of the source
operands of instruction I2 (step 720). A scan of renaming
table 235 in search of a valid entry matching register R2 is
initiated (step 731). Assuming the sequence of assembler
instructions I1 through I3 maintains a conventional
(sequential) program order, only one valid entry matching
register R2 should and does exist in renaming table
235. As shown in Figure 6C, the second entry of renaming
table 235 contains an association of operand register
R2 with RRN "010". RRN "010" is retrieved from renaming
table 235 and recorded in the SR1RRN field 332a of
a corresponding instruction I2 package (steps 733 and
736). Later, instruction I2 package will be able to garner
the source operand register value associated with RRN
"010" as the corresponding result package counterflows
down result pipe 222.

Prior to assigning a new RRN for destination operand
register R2, CPU 110 scans renaming table 235 for
any "valid" entry(s) matching register R2 (step 741). As
shown in Figure 6C, one valid entry matching register R2
already exists. RRN counter 236 is incremented by one
to produce a new RRN of "011", and a new association
of register R2 with RRN "011" is recorded over the same
(second) entry of renaming table 235, and the entry
marked valid as shown in Figure 6D (steps 743 and 744).
In addition, RRN "011" is recorded in the DRRRN field
338a of instruction I2 package (step 745).

Instruction I2 package then enters instruction stage
IS(1) (behind instruction I1) and advances up instruction
pipe 221 before reaching an available instruction stage
capable of performing the arithmetic operation subtract
(SUB), e.g., instruction stage IS(g) (step 750). When all
the source register operand values of instruction I2 (only
register R2 in instruction I2) have been garnered and
stored in the appropriate value fields 333, 336, instruction
I2 package is executed (step 761). The integer value "5"
is subtracted from the garnered register value of source
operand register R2 and a new register value computed
for destination operand register R2 (step 762). This new
register value is recorded in result value field 339 of
instruction I2 package (step 763). In addition, the register
value for register R2, together with new RRN "011", is
inserted laterally into result stage RS(g) by corresponding
comparator/inserter CI(g) (step 764).

When instruction I3 enters register exam stage 230, registers R2 and R3
are identified as the source operands (step 720). CPU 110 initiates a
retrieval of the RRN associated with source operand register R2,
beginning with a scan of renaming table 235 for a valid entry matching
register R2 (step 731). As shown in Figure 6D, RRN "011" associated
with matched register R2 is retrieved from renaming table 235 and
recorded into the SR1RRN field 332a of a corresponding instruction I3
package (steps 733, 736).

The scanning of renaming table 235 and the RRN retrieval process are
attempted for source operand register R3 (step 731), with no valid
entry matching register R3 found, as shown in Figure 6D. As shown in
Figure 6E, a new RRN "100" is assigned to register R3 and recorded in
renaming table 235 (step 734). A request for the register value of
operand register R3 is communicated to register file 210 (step 735).
The request results in an insertion of a result package having RRN
"100" and the associated register value. RRN "100" is also recorded in
SR2RRN field 335a of instruction I3 package (step 736).
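
The two source-operand outcomes just described (a valid entry is found, or a new RRN must be assigned and the value requested from the register file) can be sketched as follows. This is a simplified model under stated assumptions: process_source is a hypothetical helper, the renaming table is a plain list of dictionaries, the result pipe is a list of (RRN, value) pairs, and the value 7 held by R3 is invented for illustration.

```python
def process_source(table, counter, reg, register_file, result_pipe):
    """Return (rrn, counter) for one source operand register."""
    for entry in table:                                  # scan (step 731)
        if entry["valid"] and entry["reg"] == reg:
            return entry["rrn"], counter                 # retrieve RRN (step 733)
    counter += 1                                         # assign a new RRN (step 734)
    table.append({"reg": reg, "rrn": counter, "valid": True})
    # The register file request (step 735) is modeled by its effect:
    # a result package carrying the new RRN and the register value.
    result_pipe.append((counter, register_file[reg]))
    return counter, counter


# State of Figure 6D, then the two source operands of instruction I3.
table = [
    {"reg": "R1", "rrn": 0b001, "valid": True},
    {"reg": "R2", "rrn": 0b011, "valid": True},
]
register_file = {"R3": 7}   # hypothetical register contents
result_pipe = []
counter = 0b011

sr1_rrn, counter = process_source(table, counter, "R2", register_file, result_pipe)
sr2_rrn, counter = process_source(table, counter, "R3", register_file, result_pipe)
print(format(sr1_rrn, "03b"), format(sr2_rrn, "03b"), result_pipe)
```

The returned RRNs are what would be written into the SR1RRN and SR2RRN fields (step 736).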

Next, CPU 110 scans renaming table 235 for any entry matching
destination operand register R4 of instruction I3 package (step 741).
As shown in Figure 6E, no valid entry matching register R4 exists. CPU
110 increments RRN counter 236 by one to generate a new RRN of "101".
Using the fourth entry of renaming table 235, an assignment of register
R4 with RRN "101" is recorded, as shown in Figure 6F (step 744). In
addition, RRN "101" is written into DRRRN field 338a of instruction I3
package (step 745).

The resulting instruction I3 package is inserted into execution pipe
220 at instruction stage IS(1) (step 750). Instruction I3 package then
advances up execution pipe 220 before reaching an available instruction
stage capable of performing a multiply (MUL), e.g., instruction stage
IS(h). Eventually, the result packages which include RRN "011" and
"100", respectively, counterflow down result pipe 222 to result stage
RS(h). These two result packages are garnered (step 761) and the
respective register values stored in source value fields 333, 336. When
both source operand register values of instruction I3 package have been
garnered from result pipe 222, instruction I3 package is executed. The
respective retrieved register values stored in source value fields 333,
336 are multiplied together to produce a register value for destination
operand register R4 (step 762).

The register result value is now recorded into result value field 339
of instruction I3 package (step 763). This register value and RRN
"101", associated with destination operand register R4, are also
inserted into result stage RS(h) of result pipe 222 via
comparator/inserter CI(h) (step 764). Eventually, instruction I3
package propagates up instruction pipe 221 and exits at instruction
stage IS(N).
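
The garner-then-execute behaviour of steps 761 through 764 can be illustrated with a small Python sketch. It is a toy model only: InstructionPackage, garner and execute are hypothetical names, result packages are (RRN, value) pairs, the operand values 6 and 7 are invented, and immediate operands such as the "5" of instruction I2 are not modeled.

```python
import operator
from dataclasses import dataclass
from typing import Optional

OPS = {"SUB": operator.sub, "MUL": operator.mul}


@dataclass
class InstructionPackage:
    opcode: str
    sr1_rrn: int                      # SR1RRN field 332a
    sr2_rrn: int                      # SR2RRN field 335a
    dr_rrn: int                       # DRRRN field 338a
    sr1_value: Optional[int] = None   # source value field 333
    sr2_value: Optional[int] = None   # source value field 336
    result: Optional[int] = None      # result value field 339


def garner(pkg, result_pkg):
    """Copy a passing result value into every empty source field whose RRN matches."""
    rrn, value = result_pkg
    if rrn == pkg.sr1_rrn and pkg.sr1_value is None:
        pkg.sr1_value = value
    if rrn == pkg.sr2_rrn and pkg.sr2_value is None:
        pkg.sr2_value = value


def execute(pkg):
    """Execute once both sources are garnered; return the result package to insert."""
    if pkg.sr1_value is None or pkg.sr2_value is None:
        return None                                              # still waiting
    pkg.result = OPS[pkg.opcode](pkg.sr1_value, pkg.sr2_value)   # steps 762, 763
    return (pkg.dr_rrn, pkg.result)                              # step 764


i3 = InstructionPackage("MUL", sr1_rrn=0b011, sr2_rrn=0b100, dr_rrn=0b101)
for result_pkg in [(0b001, 9), (0b011, 6), (0b100, 7)]:   # hypothetical values
    garner(i3, result_pkg)
inserted = execute(i3)
print(inserted)
```

Matching on the RRN rather than on the register name is what lets the package ignore the unrelated result tagged "001" and any older value of R2 that may still be in the result pipe.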

Note that whenever an instruction package exits instruction stage
IS(N), the non-unique register identifier, e.g., the physical register
number, and the corresponding register value, stored in fields 338,
339, respectively, of the instruction package are used to update the
corresponding register value(s) of register file 210. For example, in
the case of instruction I3 package, non-unique register identifier R4
of instruction I3 package and its associated register result value are
used to update the register value of register R4 in register file 210.
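
As a minimal sketch of this write-back, assuming a dictionary stands in for register file 210 and each exiting package is a dictionary with hypothetical keys "dr" (field 338), "dr_rrn" (field 338a) and "result" (field 339), with invented result values:

```python
def retire(register_file, package):
    """Write the result back under the non-unique register identifier (field 338)."""
    register_file[package["dr"]] = package["result"]


register_file = {"R2": 0, "R4": 0}

# Packages exit IS(N) in program order; the result values are hypothetical.
for package in [
    {"dr": "R2", "dr_rrn": 0b010, "result": 8},
    {"dr": "R2", "dr_rrn": 0b011, "result": 3},
    {"dr": "R4", "dr_rrn": 0b101, "result": 21},
]:
    retire(register_file, package)

print(register_file)
```

The RRN plays no part in the update itself; it only distinguishes values while they are in flight.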

In accordance with another aspect of the present invention, whenever an
instruction causes a trap condition, e.g., a result overflow, it is not
necessary to kill all result packages associated with the same
non-unique register identifier. This is because, although multiple
register values in result pipe 222 may have originated from an
identical non-unique register identifier, each corresponding result
package in result pipe 222 includes a unique and hence distinguishable
RRN.
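
The kill mechanism itself is not detailed in this passage, so the following Python fragment is only one possible reading of it: result packages are (RRN, value) pairs, kill_results is a hypothetical helper, and the values are invented. It shows that a result can be removed by its RRN while an older value of the same register survives.

```python
def kill_results(result_pipe, trapped_rrn):
    """Drop only the result packages tagged with the trapping instruction's RRN."""
    return [(rrn, value) for rrn, value in result_pipe if rrn != trapped_rrn]


# Two in-flight values of register R2 (RRN 010 and 011) and one of R4 (RRN 101).
result_pipe = [(0b010, 8), (0b011, 3), (0b101, 21)]
survivors = kill_results(result_pipe, 0b011)
print(survivors)
```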

The above description of counterflow pipeline 200 and renaming table
235 is merely illustrative of the present invention. For example,
renaming table 235 may be implemented using dedicated hardware, such as
associative memory, a software table in system memory 108, or a
combination of both. Other variations are also possible. For example,
unique RRN fields 332a, 335a, 338a may replace non-unique register
identifier fields 332, 335, 338, respectively, upon entering execution
pipe 220. Subsequently, non-unique fields 332, 335, 338 are restored
upon exiting execution pipe 220 and used to update register file 210.
Hence, many modifications and/or additions by one skilled in the art
are possible without deviating from the spirit of the present
invention.


Claims 

1. A pipeline processor comprising:

an instruction pipe for executing a first instruction to produce a
register value; and

a renaming table coupled to said instruction pipe for recording an
assignment of a unique register identifier to said register value.

2. The pipeline processor of claim 1 further comprising a result pipe
for retaining said register value and said unique register identifier.

3. The pipeline processor of claim 1 wherein said register value and
said unique register identifier form a result package.

4. The pipeline processor of claim 3 further comprising a
comparator/inserter coupled between said instruction pipe and said
result pipe for inserting said result package into said result pipe and
for garnering said register value from said result pipe into said
instruction pipe for a younger instruction.

5. The pipeline processor of claim 1 further comprising a counter for
producing said unique register identifier.

6. The pipeline processor of claim 5 wherein said unique register
identifier is a renamed register number (RRN).

7. The pipeline processor of claim 6 wherein said coun- 
ter is a modulo counter. 

8. The pipeline processor of claim 2 wherein said instruction pipe and
said result pipe are coupled so as to counterflow with respect to each
other.

9. A method of labeling register values of instructions useful in a
pipeline processor having an instruction pipe and a result pipe, the
method comprising the step of:

assigning a unique register identifier to a destination operand
register of a first instruction in the processor.

10. The method of claim 9 further including the steps of:

associating said unique register identifier with a destination register
value produced by an execution of the first instruction in the
instruction pipe; and

inserting said unique register identifier and said associated
destination register value into the result pipe.

11. The method of claim 10 further including the steps 
of: 

associating a source operand register of a 
younger instruction with said unique register identi- 
fier wherein said source operand register of the 
younger instruction is identical to said destination 
operand register of the first instruction; and 

garnering said register value associated with 
said unique register identifier from the result pipe for 
a source register value corresponding to said source 
operand register of the younger instruction. 

12. The method of claim 11 wherein said unique register 
identifier is a renamed register number (RRN) and 
said method further comprising the step of recording 
the assignment of said RRN to said destination oper- 
and register in a renaming table. 

13. The method of claim 12 further including a step of 
scanning said renaming table for an existing entry 
matching said destination operand register and 
invalidating any such existing entry prior to the 
recording step. 

14. The method of claim 12 wherein said step of asso- 
ciating said source operand register of the younger 
instruction includes a step of retrieving said RRN by 
scanning said renaming table for an entry matching 
said source operand register of the younger instruc- 
tion. 



15. The method of claim 10 further comprising a step of updating a
register file with said destination register value.

16. The method of claim 14 wherein said pipeline processor is a
counterflow pipeline processor and said method further comprises a step
of counterflowing said register value and said RRN in the result pipe
prior to the garnering step.

17. A method of labeling register values of instructions useful in a
pipeline processor having an instruction pipe and a result pipe, the
method comprising the step of:

assigning a unique register identifier to a source operand register of
a first instruction in the processor.

18. The method of claim 17 further comprising the step of scanning a
renaming table for an entry matching said source operand register of
the first instruction.

19. The method of claim 18 wherein said unique register identifier is a
renamed register number (RRN) and said method further comprising the
step of recording the assignment of said RRN to said source operand
register in the renaming table.

20. The method of claim 19 further comprising the step of communicating
to a register file a request for a register value corresponding to said
source operand register.

21. The method of claim 20 further comprising the step of inserting
said register value and said RRN into the result pipe.

22. The method of claim 21 wherein said pipeline processor is a
counterflow pipeline processor and said method further comprises a step
of counterflowing said register value and said RRN in the result pipe.

23. The method of claim 22 further comprising a step of garnering said
register value associated with said RRN from the result pipe into the
instruction pipe.



FIG. 1 (system 100): input device 104, I/O circuit 107, display device
106, CPU 110, system memory 108.



FIG. 2A (counterflow pipeline 200): register file 210, execution pipe
220, register exam stage 230, renaming table 235, RRN counter 236,
decoder, instruction launch stage, instruction cache.



FIG. 2B (execution pipe 220): instruction pipe 221 with instruction
stages IS(1) through IS(N) (instruction flow), result pipe 222 with
result stages RS(1) through RS(N) (result flow), and
comparator/inserters CI(1) through CI(N) between corresponding
instruction and result stages.



FIG. 3 (instruction package 300): OP CODE 331; SR1 332, SR1RRN 332a,
source VALUE 333; SR2 335, SR2RRN 335a, source VALUE 336; DR 338, DRRRN
338a, result VALUE 339; FLAG fields (334, 337, 340).







FIG. 4 (result package 400): fields FLAG, RRN 452 and VALUE 453
(reference numeral 454 is also shown).



FIG. 5 (renaming table 235): columns Physical Register Number (PRN),
Renamed Register Number (RRN) and Status; every entry is shown with
Status "Invalid".



Renaming table 235

PRN   RRN   Status
-     -     Invalid
-     -     Invalid
...
-     -     Invalid

FIG. 6A



Renaming table 235

PRN   RRN   Status
R1    001   Valid
-     -     Invalid
-     -     Invalid
...
-     -     Invalid

FIG. 6B



Renaming table 235

PRN   RRN   Status
R1    001   Valid
R2    010   Valid
-     -     Invalid
...
-     -     Invalid

FIG. 6C



Renaming table 235

PRN   RRN   Status
R1    001   Valid
R2    011   Valid
-     -     Invalid
...
-     -     Invalid

FIG. 6D


Renaming table 235

PRN   RRN   Status
R1    001   Valid
R2    011   Valid
R3    100   Valid
-     -     Invalid
...
-     -     Invalid

FIG. 6E



Renaming table 235

PRN   RRN   Status
R1    001   Valid
R2    011   Valid
R3    100   Valid
R4    101   Valid
-     -     Invalid
...
-     -     Invalid

FIG. 6F



Flowchart boxes, in order:

Launch an Instruction
Decode and Examine Source and Destination Operand Register(s) of the
Instruction
Process Source Operand Register(s) for an Instruction Package
Process Destination Operand Register for the Instruction Package
Insert the Instruction Package into an Execution Pipe
Execute the Instruction Package

FIG. 7A



START

731: Scan Renaming Table for a valid Entry corresponding to each
unprocessed Source Operand Register.

732: Decision on whether a valid Entry was found.

Yes, 733: Retrieve Renamed Register Number (RRN) associated with Source
Operand Register.

No, 734: Assign a new Renamed Register Number (RRN) to Register Value
and record assignment as a new Entry in Renaming Table.

735: Request Register Value associated with Source Operand Register
from Register File.

FIG. 7B



741: Scan Renaming Table for a valid Entry corresponding to Destination
Operand Register.

742: Valid Entry associated with Destination Operand Register found?

Yes, 743: Mark Entry invalid or write over Entry.

No, 744: Assign a new Renamed Register Number (RRN) and record
assignment in the Renaming Table.

745: Write RRN into DRRRN field of Instruction Package.

FIG. 7C



START

761: Garner Source Operand Register value(s) from Result Pipe by
matching RRN(s) and store register values in Instruction Package.

762: Execute Instruction Package to Produce a Result Value for
Destination Operand Register.

763: Record Result Value of Destination Operand Register into
Instruction Package.

764: Insert Result Value and RRN associated with the Destination
Operand Register laterally into Result Pipe.

FIG. 7D





European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 95 30 5193

DOCUMENTS CONSIDERED TO BE RELEVANT

Classification of the application (Int.Cl.6): G06F9/38

Citation of document with indication, where appropriate, of relevant
passages:

EP-A-0 600 611 (IBM) 8 June 1994
* page 4, line 45 - page 5, line 41; figures 2,3 *

EP-A-0 432 774 (HYUNDAI ELECTRONICS AMERICA) 19 June 1991
* page 6, line 39 - page 7, line 27 *
* page 8, line 23 - line 25 *
* page 8, line 45 - line 57 *
* page 9, line 23 - page 11, line 27 *

IEEE MICRO, vol. 11, no. 3, 1 June 1991, pages 10-13, 63-73,
XP 000237231
POPESCU V ET AL 'THE METAFLOW ARCHITECTURE'
* page 11 - page 13 *
* page 63 - page 65, left column *

COMPUTER ARCHITECTURE NEWS, vol. 20, no. 2, 1 May 1992, pages 58-67,
XP 000277754
FRANKLIN M ET AL 'THE EXPANDABLE SPLIT WINDOW PARADIGM FOR EXPLOITING
FINE-GRAIN PARALLELISM'
* page 63: "Forwarding of Register Values" *

EP-A-0 301 220 (IBM) 1 February 1989
* claim 1 *

EP-A-0 514 763 (MOTOROLA INC) 25 November 1992

Category: X, A

Relevant to claim: 1,3,5-7; 2,9-15,17-21; 2,9-15,17-21; 1,9-15,17-21;
8,16,22

The present search report has been drawn up for all claims.

Technical fields searched (Int.Cl.6): G06F

Place of search: THE HAGUE
Date of completion of the search: 11 December 1995
Examiner: Daskalakis, T

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another document of the
    same category
A : technological background
O : non-written disclosure
P : intermediate document
T : theory or principle underlying the invention
E : earlier patent document, but published on, or after the filing date
D : document cited in the application
L : document cited for other reasons
& : member of the same patent family, corresponding document



